INTRODUCTION


In this paper an investigation of asymptotically efficient estimators
for linear and nonlinear simultaneous equation econometric models is
undertaken. By using an instrumental variable approach the equivalence of
previously proposed linear estimators to full information maximum likelihood
(FIML) follows in a straightforward manner, and a class of new estimators,
which includes a nonlinear three stage least squares estimator (NL3SLS)
and a nonlinear full-information instrumental variables estimator, is
proposed and shown to be asymptotically equivalent to FIML.

First, an instrumental variable interpretation of FIML is developed
by investigating the first order conditions for the maximum of the
likelihood function without first concentrating the likelihood function. The
essential difference between 3SLS and FIML then becomes evident: FIML uses
all over-identifying restrictions in forming the instruments while 3SLS
ignores some of these restrictions. While this difference in forming the
instruments is of no importance asymptotically, as is known from the earlier
results of Sargan [9] and Rothenberg and Leenders [8], in finite samples
there seems no reason not to use all known prior information. The a priori
restrictions give a more useful criterion than Dhrymes' [3] recent
interpretation of a difference in 'purging' the endogenous variables, since
all other proposed estimators can be shown to be equivalent by simply
proving asymptotic equivalence of the instruments used to those used by the
FIML estimator.

The next result is to derive the necessary conditions on the number
of observations to permit computation of the FIML estimate. The FIML
estimate can be computed in a class of cases where all efficient limited
information estimators such as two stage least squares (2SLS) and limited
information maximum likelihood (LIML) are infeasible. 3SLS is also
infeasible in this class of cases. The reason that the FIML estimate
exists is again that all a priori restrictions are used while the other
estimators neglect some over-identifying restrictions in forming the
instruments. Thus a partial solution to the much studied problem of
simultaneous equation estimation with undersized samples is given.
Previous authors in their almost exclusive attempts to extend limited
information methods failed to realize that an appropriate full information
method, by using all prior information, could make estimation possible.
Also, I point out an error of Klein [4] on degrees of freedom restrictions
for FIML estimation. I establish that FIML has less stringent degrees of
freedom requirements than other estimators rather than more stringent
requirements as he asserted.

Then using the instrumental variable interpretation, a relation
between FIML and the class of estimators recently proposed by Dhrymes [2],
Lyttkens [5], and Brundy and Jorgenson [1] is established. The full
information instrumental variable estimators are shown to be special cases
of the basic FIML iteration. Furthermore, if they are iterated and
converge, the resulting estimates are the FIML estimates.

Lastly, FIML is considered in the nonlinear case; and it is shown
that in the special case of nonlinearity in the parameters the
instrumental variable interpretation can be extended to provide an
asymptotically efficient estimator with less computation needed than the
FIML estimator. In a similar way a nonlinear three stage least squares
estimator is proposed and demonstrated to be asymptotically equivalent to
FIML. NL3SLS again neglects some over-identifying restrictions in forming
the instruments, so that in finite samples the instrumental variable
estimator which uses all the restrictions seems preferable.
2.   Specification  and  Assumptions  for  the  Linear  Case

The standard linear simultaneous equations model is considered first,
where all identities are assumed to have been substituted out of the system
of equations:

(1)  $YB + Z\Gamma = U,$

where Y is the T x M matrix of jointly dependent variables, Z is the T x K
matrix of predetermined variables, and U is a T x M matrix of the structural
disturbances of the system. The model thus has M equations and T observations.
It is assumed that B is nonsingular, rk(Z) = K, and that all equations satisfy
the rank condition for identification. Also if lagged endogenous variables
are included as predetermined variables, the system is assumed to be stable.
Lastly, an orthogonality assumption, E(Z'U) = 0, between the predetermined
variables and structural errors is required; and the second order moment
matrices of the current predetermined and endogenous variables are assumed
to have non-singular probability limits.

The structural errors are assumed to be mutually independent and
identically distributed (iid) as a nonsingular M-variate normal (Gaussian)
distribution:

(2)  $U \sim N(0, \Sigma \otimes I_T)$

where $\Sigma$ is positive definite almost surely, and no restrictions are
placed on $\Sigma$. Thus for the present we assume the presence of
contemporaneous correlation but no intertemporal correlation. The rows of U
are thus distributed as M-variate normal, $U_t \sim N(0, \Sigma)$.
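As an illustration of the specification in (1) and (2), the following is a minimal sketch that draws data from a two-equation system by solving for the reduced form. The numerical values of B, Gamma, and Sigma and the function name `simulate_linear_sem` are assumptions chosen only for the example; they do not come from the paper.

```python
import numpy as np

def simulate_linear_sem(T, B, Gamma, Sigma, seed=0):
    """Draw (Y, Z, U) satisfying Y B + Z Gamma = U with rows of U iid N(0, Sigma)."""
    rng = np.random.default_rng(seed)
    K, M = Gamma.shape
    Z = rng.normal(size=(T, K))                               # predetermined variables
    U = rng.multivariate_normal(np.zeros(M), Sigma, size=T)   # structural errors
    Y = (U - Z @ Gamma) @ np.linalg.inv(B)                    # reduced form (B nonsingular)
    return Y, Z, U

# Hypothetical parameter values (illustration only).
B = np.array([[1.0, 0.3],
              [-0.5, 1.0]])
Gamma = np.array([[-1.0, 0.0],
                  [-1.0, 0.0],
                  [0.0, -1.0],
                  [0.0, -1.0]])
Sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])

Y, Z, U = simulate_linear_sem(200, B, Gamma, Sigma)
residual_check = np.abs(Y @ B + Z @ Gamma - U).max()
```

The zero entries of Gamma play the role of the exclusion restrictions discussed next; each equation here excludes two predetermined variables.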




Now the identification assumptions will exclude some variables from
each equation, so let $r_i$ and $s_i$ denote the number of included jointly
dependent and predetermined variables, respectively, in the i-th equation.
Then rewriting (1) after choice of a normalization rule:

(3)  $y_i = X_i \delta_i + u_i$,   (i = 1, 2, ..., M)

where

     $X_i = [Y_i \;\; Z_i]$,   $\delta_i = [\beta_i' \;\; \gamma_i']'$

so that $X_i$ contains the $t_i = r_i + s_i - 1$ variables whose coefficients
are not known a priori to be zero. It will prove convenient to stack these M
equations into a system:

(4)  $y = X\delta + u$

where $y = [y_1' \cdots y_M']'$, $X = \mathrm{diag}(X_1, \ldots, X_M)$,
$\delta = [\delta_1' \cdots \delta_M']'$, and $u = [u_1' \cdots u_M']'$.



3.   An  Instrumental  Variable  Interpretation  of  FIML 

The technique used to derive an instrumental variable interpretation
of FIML is similar to, but not identical with, a proposal by Durbin in an
unpublished paper. While not deriving Durbin's result from the likelihood
function, Malinvaud states the estimator, which he calls 'Durbin's Method'
[7, pp. 686-7]. However, the resulting estimator differs from the
estimator proposed here by not making full use of the identifying
restrictions, the two being identical only in the case of a just-identified
system. The instrumental variable interpretation of a maximum likelihood
estimator, while known in the case of non-simultaneous equation models, is
here extended to the case of FIML, thus giving an integrated method in which
to interpret the many estimators proposed for econometric models.

Given assumption (2) the likelihood function of the sample is

(5)  $L(B, \Gamma, \Sigma) = (2\pi)^{-MT/2} \det(\Sigma)^{-T/2} \det(B)^{T} \exp[-\tfrac{1}{2} \mathrm{tr}\, \Sigma^{-1} (YB + Z\Gamma)' (YB + Z\Gamma)]$

Taking logs and rearranging, we derive the function to be maximized

(6)  $L(B, \Gamma, \Sigma) = C + \tfrac{T}{2} \log\det(\Sigma)^{-1} + T \log\det(B) - \tfrac{T}{2} \mathrm{tr}[\tfrac{1}{T} \Sigma^{-1} (YB + Z\Gamma)' (YB + Z\Gamma)]$


where the constant C may be disregarded in maximizing the likelihood
function. Since no restrictions have been placed on the elements of $\Sigma$,
the usual procedure is to 'concentrate' the likelihood function by partially
maximizing the function with respect to $\Sigma$. This procedure sets
$\Sigma = T^{-1} (YB + Z\Gamma)' (YB + Z\Gamma)$ and thus eliminates $\Sigma$
from the likelihood function, leaving a function of B and $\Gamma$ alone to be
maximized. Our procedure instead concentrates on the presence of the Jacobian
det(B) in the likelihood function, which differentiates the simultaneous
equation problem from the Zellner [10] multivariate least squares problem.
For if the Jacobian of the transformation from U to Y,
$\partial U / \partial Y$, were an identity matrix, the maximum likelihood
estimator would be the generalized least squares estimator. Also it will
be seen in a later section that the Jacobian is crucial in the development
of a non-linear FIML estimator.

To maximize the log likelihood function $L(B, \Gamma, \Sigma)$, the necessary
conditions for a maximum are the first order conditions obtained by
differentiating (6) using the relation
$\partial \log\det(A) / \partial A = (A')^{-1}$. Note that the a priori
restrictions have been imposed, so that only the derivatives with respect to
elements of B and $\Gamma$ not known a priori to be zero are set equal to zero:

(7)  $\partial L / \partial B$ :  $T (B')^{-1} - Y' (YB + Z\Gamma) \Sigma^{-1} = 0$

(8)  $\partial L / \partial \Gamma$ :  $-Z' (YB + Z\Gamma) \Sigma^{-1} = 0$

(9)  $\partial L / \partial \Sigma^{-1}$ :  $T\Sigma - (YB + Z\Gamma)' (YB + Z\Gamma) = 0$
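The matrix derivative used to obtain (7), $\partial \log\det(A) / \partial A = (A')^{-1}$, can be checked numerically. The sketch below compares the analytic expression with central finite differences; the test matrix and the helper name `logdet_gradient_fd` are assumptions made for illustration.

```python
import numpy as np

def logdet_gradient_fd(A, eps=1e-6):
    """Central finite-difference gradient of log|det(A)| with respect to each A[i, j]."""
    G = np.zeros_like(A)
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            E = np.zeros_like(A)
            E[i, j] = eps
            up = np.linalg.slogdet(A + E)[1]    # slogdet returns (sign, log|det|)
            down = np.linalg.slogdet(A - E)[1]
            G[i, j] = (up - down) / (2 * eps)
    return G

# Hypothetical nonsingular matrix (illustration only).
A = np.array([[2.0, 0.5, 0.0],
              [-1.0, 3.0, 0.2],
              [0.3, 0.0, 1.5]])

analytic = np.linalg.inv(A).T        # (A')^{-1}
numeric = logdet_gradient_fd(A)
max_gap = np.abs(analytic - numeric).max()
```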


Concentration of the likelihood function follows from solving for $\Sigma$ in
equation (9); here we instead solve for T using equation (9). Since the
M-variate distribution has been assumed non-singular, from equation (2)
$\Sigma$ is positive definite almost surely, so from equation (9),

(10)  $T I = (YB + Z\Gamma)' (YB + Z\Gamma) \Sigma^{-1}$

Substituting this result for the first term in equation (7) yields

(11)  $(B')^{-1} (YB + Z\Gamma)' (YB + Z\Gamma) \Sigma^{-1} - Y' (YB + Z\Gamma) \Sigma^{-1} = 0.$

The first term in (11) represents the presence of the non-identity Jacobian,
but this term can be simplified by rearranging to get

(12)  $[(B')^{-1} B'Y' + (B')^{-1} \Gamma'Z'] [YB + Z\Gamma] \Sigma^{-1} - Y' (YB + Z\Gamma) \Sigma^{-1} = 0.$

Noting that in equation (12) the first and last terms are identical with
opposite sign, since $(B')^{-1} B'Y' = Y'$, we have the desired first order
condition

(13)  $(B')^{-1} \Gamma'Z' (YB + Z\Gamma) \Sigma^{-1} = 0$
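The step from (11) to (13) is a purely algebraic identity, so it holds for any conformable Y, Z, B, $\Gamma$, and $\Sigma$, not only at the maximum. A minimal numerical sketch, with arbitrary matrices assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
T, M, K = 50, 2, 3

# Arbitrary data and parameter values (hypothetical, illustration only).
Y = rng.normal(size=(T, M))
Z = rng.normal(size=(T, K))
B = np.array([[1.0, 0.3],
              [-0.5, 1.0]])
Gamma = rng.normal(size=(K, M))
Sigma_inv = np.linalg.inv(np.array([[1.0, 0.5],
                                    [0.5, 1.0]]))

U = Y @ B + Z @ Gamma
Bt_inv = np.linalg.inv(B.T)

# Left-hand side of (11) and left-hand side of (13).
lhs_11 = Bt_inv @ U.T @ U @ Sigma_inv - Y.T @ U @ Sigma_inv
lhs_13 = Bt_inv @ Gamma.T @ Z.T @ U @ Sigma_inv
identity_gap = np.abs(lhs_11 - lhs_13).max()
```

The two expressions agree up to rounding because $(B')^{-1}(YB + Z\Gamma)' = Y' + (B')^{-1}\Gamma'Z'$.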


Therefore equations (8) and (13) must be solved, and 'stacking' them together
yields the final form of the necessary conditions

(14)  $\begin{bmatrix} -Z' \\ (B')^{-1} \Gamma'Z' \end{bmatrix} (YB + Z\Gamma) \Sigma^{-1} = 0.$

Rewriting equation (14) in the form of equation (4), the FIML estimator
$\hat\delta$ of the unknown elements of $\delta$ in instrumental variable
form is:

(15)  $\hat\delta = (\bar W' X)^{-1} \bar W' y$

where the instruments are

(16)  $\bar W' = \bar X' (S \otimes I_T)^{-1}.$

The elements of $\bar W$ are then

(17)  $\bar X = \mathrm{diag}(\bar X_1, \bar X_2, \ldots, \bar X_M)$,   $\bar X_i = [Z (\Gamma B^{-1})_i \;\; Z_i]$

and from equation (9)

(18)  $S = T^{-1} (YB + Z\Gamma)' (YB + Z\Gamma)$


The instrumental variable interpretation of equations (15) and (16) is
immediate since the second order moment matrices exist and are non-singular,
and by the orthogonality assumption E(Z'U) = 0. In the instrumental
variable interpretation of generalized least squares where only predetermined
variables appear in X, the instruments are all the predetermined variables,
$W' = Z' (S \otimes I_T)^{-1}$, while here the included endogenous variables
are replaced by consistent estimates which are then used as the instruments.

Equation (15) is non-linear since both $\bar X$ and S depend on B and
$\Gamma$, which are elements of $\delta$, and would therefore be solved by an
iterative process ('Durbin's Method') where subscripts here denote iteration
number:

(19)  $\hat\delta_{k+1} = (\bar W_k' X)^{-1} \bar W_k' y.$

The limit of the iterative process, if it converges, $\delta^*$, is the FIML
estimate with asymptotic covariance matrix
$(\bar X^{*\prime} (S^* \otimes I_T)^{-1} X)^{-1}$ since asymptotically

(20)  $\sqrt{T} (\hat\delta - \delta) \sim N(0, V^{-1})$

where $V = -\mathrm{plim}\, E[\frac{1}{T} \frac{\partial^2 L}{\partial\delta \, \partial\delta'}]$.
Thus equation (15) extends the concept of instrumental variables to the
maximum likelihood estimation of simultaneous equation models so that very
simple comparisons with other proposed estimators are possible.
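The iteration (19) can be sketched in a few lines. The code below is an illustrative implementation under stated assumptions, not the author's program: the two-equation design, the `spec` layout (dependent index, included endogenous indices, included predetermined indices), the 2SLS starting values, and the function names are all hypothetical. The restricted reduced-form prediction is computed as $-Z\Gamma B^{-1}$, which is the sign implied by (1) with unit normalization on the diagonal of B; the products $\bar W'X$ and $\bar W'y$ are formed blockwise to avoid building $S \otimes I_T$.

```python
import numpy as np

def iv_blocks(left, right, ys, S_inv):
    """W'X and W'y for W = (S^{-1} kron I_T) diag(left), computed block by block."""
    M = len(left)
    WX = np.block([[S_inv[i, j] * left[i].T @ right[j] for j in range(M)]
                   for i in range(M)])
    Wy = np.concatenate([sum(S_inv[i, j] * left[i].T @ ys[j] for j in range(M))
                         for i in range(M)])
    return WX, Wy

def fiml_iv_iterate(Y, Z, spec, n_iter=5):
    """Iterate delta_{k+1} = (W_k'X)^{-1} W_k'y starting from 2SLS."""
    T, M = Y.shape
    K = Z.shape[1]
    X = [np.hstack([Y[:, en], Z[:, pr]]) for _, en, pr in spec]
    ys = [Y[:, dep] for dep, _, _ in spec]
    sizes = [Xi.shape[1] for Xi in X]
    # Consistent starting values: equation-by-equation 2SLS.
    Y_fit = Z @ np.linalg.lstsq(Z, Y, rcond=None)[0]
    X_fit = [np.hstack([Y_fit[:, en], Z[:, pr]]) for _, en, pr in spec]
    delta = np.concatenate([np.linalg.solve(Xh.T @ Xi, Xh.T @ yi)
                            for Xh, Xi, yi in zip(X_fit, X, ys)])
    for _ in range(n_iter):
        parts = np.split(delta, np.cumsum(sizes)[:-1])
        B = np.zeros((M, M))
        G = np.zeros((K, M))
        for i, ((dep, en, pr), d) in enumerate(zip(spec, parts)):
            B[dep, i] = 1.0                 # normalization rule
            B[en, i] = -d[:len(en)]         # restricted elements stay zero
            G[pr, i] = -d[len(en):]
        U = Y @ B + Z @ G
        S_inv = np.linalg.inv(U.T @ U / T)  # S from equation (18)
        Y_bar = -Z @ G @ np.linalg.inv(B)   # restricted reduced-form prediction
        X_bar = [np.hstack([Y_bar[:, en], Z[:, pr]]) for _, en, pr in spec]
        WX, Wy = iv_blocks(X_bar, X, ys, S_inv)
        delta = np.linalg.solve(WX, Wy)
    return delta

# Hypothetical design: y1 = 0.5 y2 + z1 + z2 + u1,  y2 = -0.3 y1 + z3 + z4 + u2.
rng = np.random.default_rng(0)
T = 2000
B0 = np.array([[1.0, 0.3], [-0.5, 1.0]])
G0 = np.array([[-1.0, 0.0], [-1.0, 0.0], [0.0, -1.0], [0.0, -1.0]])
Z = rng.normal(size=(T, 4))
U0 = rng.multivariate_normal(np.zeros(2), [[1.0, 0.5], [0.5, 1.0]], size=T)
Y = (U0 - Z @ G0) @ np.linalg.inv(B0)
spec = [(0, [1], [0, 1]), (1, [0], [2, 3])]

delta_hat = fiml_iv_iterate(Y, Z, spec)
```

A fixed number of iterations is used here for simplicity; a convergence test on successive values of `delta` would be the natural refinement, since the text only guarantees the FIML interpretation if the process converges.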



4.   Equivalence  of  FIML  and  3SLS 

An instrumental variable interpretation of 3SLS was first advanced
by Madansky [6]. In this interpretation the 3SLS estimator has the form

(21)  $\hat\delta_{3SLS} = (\tilde W' X)^{-1} \tilde W' y$

where here the instruments are

(22)  $\tilde W = Z (\tilde S \otimes Z'Z)^{-1} Z' X$

(with Z understood as $I_M \otimes Z$ so that the product is conformable).
The elements of $\tilde W$ are

(23)  $X = \mathrm{diag}(X_1, \ldots, X_M)$,   $X_i = [Y_i \;\; Z_i]$

and $\tilde S$ is the consistent estimate of $\Sigma$ derived from the
residuals of the structural equations estimated by 2SLS. The essential
differences between FIML and 3SLS may be discovered by an examination of the
difference in instruments between equation (16) and equation (22). The first
difference is the consistent estimation of the variance covariance matrix
$\Sigma$. The two estimates are asymptotically equivalent in probability
limit since by consistency

(24)  $\mathrm{plim}\, S = \mathrm{plim}\, \tilde S = \Sigma.$

The second difference is that the FIML estimator uses all a priori
restrictions in computing the instruments, while as seen from equation
(22), 3SLS uses an unrestricted estimate in computing the instruments.
Again asymptotic equivalence follows since

(25)  $\mathrm{plim}\, \hat\Gamma \; \mathrm{plim}\, \hat B^{-1} = \mathrm{plim}\, (Z'Z)^{-1} Z'Y$

since plim Z'U = 0 and $\hat\Gamma$, $\hat B^{-1}$ have finite probability
limits by assumption. Therefore the two differences lie in not making
complete use of the identifying restrictions in estimating the instruments
$\bar W$ and $\tilde W$ and in different estimates of the covariance matrix
$\Sigma$. In finite samples the two methods can be equivalent only in the
just identified case since the instruments would then be identical. Lastly
the equivalence results of Sargan [9] and Rothenberg and Leenders [8] are
obtained without the necessity of an asymptotic expansion by the result that
$\bar W$ and $\tilde W$ are equivalent in the probability limit. Furthermore,
any other asymptotically efficient instrumental variable estimator may be
proved equivalent to FIML by the same technique.
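For comparison, the 3SLS estimator in the instrumental variable form of (21) and (22) can be sketched as follows. This is an illustration under assumptions: the simulated two-equation design, the `spec` layout, and the function name `three_sls` are hypothetical. The sketch uses the fact that the (i, j) block of $\tilde W'X$ is $\tilde s^{ij} X_i' P_Z X_j$, where $P_Z = Z(Z'Z)^{-1}Z'$ and $\tilde s^{ij}$ is an element of $\tilde S^{-1}$, so the instruments rest on the unrestricted reduced-form fit.

```python
import numpy as np

def three_sls(Y, Z, spec):
    """3SLS in IV form: instruments built from the unrestricted fit P_Z Y."""
    T, M = Y.shape
    Y_fit = Z @ np.linalg.lstsq(Z, Y, rcond=None)[0]            # P_Z Y
    X = [np.hstack([Y[:, en], Z[:, pr]]) for _, en, pr in spec]
    PX = [np.hstack([Y_fit[:, en], Z[:, pr]]) for _, en, pr in spec]  # P_Z X_i
    ys = [Y[:, dep] for dep, _, _ in spec]
    # 2SLS residuals give the consistent estimate S-tilde.
    d2 = [np.linalg.solve(Ph.T @ Xi, Ph.T @ yi) for Ph, Xi, yi in zip(PX, X, ys)]
    resid = np.column_stack([yi - Xi @ d for yi, Xi, d in zip(ys, X, d2)])
    S_inv = np.linalg.inv(resid.T @ resid / T)
    WX = np.block([[S_inv[i, j] * PX[i].T @ X[j] for j in range(M)]
                   for i in range(M)])
    Wy = np.concatenate([sum(S_inv[i, j] * PX[i].T @ ys[j] for j in range(M))
                         for i in range(M)])
    return np.linalg.solve(WX, Wy)

# Hypothetical design: y1 = 0.5 y2 + z1 + z2 + u1,  y2 = -0.3 y1 + z3 + z4 + u2.
rng = np.random.default_rng(0)
T = 2000
B0 = np.array([[1.0, 0.3], [-0.5, 1.0]])
G0 = np.array([[-1.0, 0.0], [-1.0, 0.0], [0.0, -1.0], [0.0, -1.0]])
Z = rng.normal(size=(T, 4))
U0 = rng.multivariate_normal(np.zeros(2), [[1.0, 0.5], [0.5, 1.0]], size=T)
Y = (U0 - Z @ G0) @ np.linalg.inv(B0)
spec = [(0, [1], [0, 1]), (1, [0], [2, 3])]

delta_3sls = three_sls(Y, Z, spec)
```

The only structural difference from the FIML iteration is the line computing `Y_fit`: 3SLS regresses Y on all of Z without imposing the exclusion restrictions.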


5.   The  Incorrectness  of  Klein's  Degrees  of  Freedom  Restrictions 

In the finite sample case it is known that important restrictions
on the use of 3SLS and efficient limited information estimators such as
LIML and 2SLS are degrees of freedom restrictions. The binding restriction
is usually that the number of observations T must be no less than the total
number of predetermined variables in the system K. Evaluation of the inner
terms of the 3SLS instruments

(26)  $(\tilde S \otimes Z'Z)^{-1} = (\tilde S^{-1} \otimes (Z'Z)^{-1})$

and use of the elementary result

(27)  Lemma 1:   $\mathrm{rk}(AB) \le \min(\mathrm{rk}(A), \mathrm{rk}(B))$

implies that $\mathrm{rk}(Z'Z) \le \min(K, T)$ and so for Z'Z to be of full
rank it is necessary that $K \le T$. We now show that this degrees of freedom
restriction is not binding for FIML and develop exact degrees of freedom
restrictions. Consideration of the instruments $\bar W$ in equation (16)
implies that $S^{-1}$ must exist. From equation (18) $S = T^{-1} U'U$ such
that application of Lemma 1 results in the condition that
$\mathrm{rk}(S) \le \min(M, T)$. Therefore a necessary condition for
estimability of FIML is $M \le T$. That this condition is almost surely
sufficient follows from

(28)  Lemma 2:   Let $X_1, \ldots, X_T$ be a random sample from an absolutely
continuous M-variate distribution with non-singular covariance matrix
$\Sigma$. Then the sample covariance matrix S is positive definite with
probability one iff $T \ge M$.

Proof:   Necessity follows from straightforward application of Lemma 1 (27)
and sufficiency comes from the following argument. The moment matrix is
positive semi-definite and the determinant vanishes only if the observations
are linearly dependent. But if the joint distribution of the observations
is absolutely continuous given $\Sigma$ non-singular, the probability of an
exact linear relationship is zero.
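The two rank facts can be seen in a small numerical sketch of an undersized sample. The dimensions below (T = 6, K = 10, M = 3) are assumptions chosen for illustration: Z'Z cannot have full rank because T < K, while the moment matrix S = U'U / T is positive definite because T is at least M.

```python
import numpy as np

rng = np.random.default_rng(0)
T, K, M = 6, 10, 3                 # hypothetical undersized sample: M <= T < K

Z = rng.normal(size=(T, K))        # predetermined variables
U = rng.normal(size=(T, M))        # draws from a continuous M-variate distribution

rank_ZZ = np.linalg.matrix_rank(Z.T @ Z)   # bounded by min(K, T) via Lemma 1
S = U.T @ U / T                            # moment matrix of the disturbances
min_eig_S = np.linalg.eigvalsh(S).min()    # strictly positive when S is p.d.
```

Here `rank_ZZ` is below K, so $(Z'Z)^{-1}$ in (26) does not exist, while `min_eig_S` is positive, so $S^{-1}$ in (16) does.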

After insuring the nonsingularity of S, the only remaining task is to
derive conditions for the nonsingularity of $(\bar W' X)$. The necessary
condition here after again using Lemma 1 (27) is that

(29)  $T \ge r_i + s_i - 1$     for all     i = 1, ..., M.

This condition is just the usual least squares condition that the number of
'right hand side' variables must not exceed the number of observations. It
is not often a binding restriction in FIML estimation. As 3SLS must also
satisfy the condition of Lemma 2 since $\tilde S$ is a moment matrix, it is
seen by application of the order condition for identification that the
conditions in finite samples for the FIML estimate are weaker than those for
3SLS estimation. The order condition of identification states that for each
equation the number of predetermined variables excluded from the equation
must be at least as great as one less than the number of endogenous
variables included in the equation

(30)  $K - s_i \ge r_i - 1$     for all     i = 1, ..., M.

Rearranging gives $K \ge s_i + r_i - 1$ and since 3SLS requires $T \ge K$ it
is seen that also for 3SLS, $T \ge r_i + s_i - 1$ for all i. Thus collecting
results we have the following:

Theorem 1:   Under the assumptions for the linear simultaneous equations
model of Section 2, the following conditions are necessary and almost
surely sufficient for the estimability of the FIML estimate: (i) the
number of observations must be at least as great as the number of
endogenous variables, $M \le T$; (ii) for each equation, the number of
observations must be at least as great as the number of included "right hand
side" variables after normalization, $T \ge r_i + s_i - 1$ for all
i = 1, ..., M. The estimability of 3SLS requires a strengthening of
condition (ii) so that the number of observations must be at least as great
as the total number of predetermined variables, $T \ge K$. Lastly, the FIML
and 3SLS estimates are identical in the just-identified case.
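The counting conditions of Theorem 1 are simple enough to state as code. The sketch below checks only the degrees of freedom conditions, not identification itself; the function names and the example dimensions are assumptions for illustration.

```python
def fiml_df_conditions(T, M, r, s):
    """Theorem 1 (i) and (ii): M <= T and T >= r_i + s_i - 1 for every equation."""
    return M <= T and all(T >= ri + si - 1 for ri, si in zip(r, s))

def three_sls_df_conditions(T, M, K, r, s):
    """3SLS additionally needs T >= K so that Z'Z can be inverted."""
    return fiml_df_conditions(T, M, r, s) and T >= K

# Hypothetical system: M = 3 equations, K = 20 predetermined variables,
# each equation including r_i = 2 endogenous and s_i = 4 predetermined variables.
M, K = 3, 20
r = [2, 2, 2]
s = [4, 4, 4]
T = 12                                   # undersized sample: T < K

order_condition = all(K - si >= ri - 1 for ri, si in zip(r, s))
fiml_ok = fiml_df_conditions(T, M, r, s)
three_sls_ok = three_sls_df_conditions(T, M, K, r, s)
```

In this example the order condition (30) holds and FIML satisfies its degrees of freedom conditions, while 3SLS does not because T < K.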

These necessary conditions are in conflict with those of Professor Klein
[4, pp. 175-6], who places the following "degrees of freedom" restrictions
on FIML estimation:

(31)  (i)' $K < T$     (ii)' $M < T$     (iii)' $K + M < T$     (iv)' $\sum_{i=1}^{M} (r_i + s_i - 1) < MT$

Except for using a strong rather than weak inequality, Klein's condition (ii)'
is identical to our condition (i). Condition (iv)' corresponds to (ii) when
both sides are summed over all equations, but it is too weak; the condition
must hold for each equation. Conditions (i)' and (iii)' would place greater
restrictions on FIML estimation than 3SLS if correct; but Professor Klein
is incorrect in claiming that the moment matrix including all endogenous
and predetermined variables must be nonsingular (his WW matrix on p. 176).
He has neglected to impose the a priori restrictions which enter in
equations (15) and (16), and this makes the requirements for estimation of
FIML weaker, not stronger, than 3SLS as he implicitly asserts. Intuitively,
this result again follows from the difference in instruments for FIML
and 3SLS, $\bar W$ and $\tilde W$, respectively. FIML imposes all a priori
restrictions in computing the instruments while 3SLS neglects these
restrictions. That FIML imposes all a priori restrictions is also the reason
for its estimability when efficient limited information methods, 2SLS and
LIML, cannot be used. Since they too treat all equations but the one being
estimated as just identified, they will have two of the restrictions of
3SLS: $T \ge K$ and $T \ge r_i + s_i - 1$ for all equations to be estimated.
Thus in many actual cases where limited information estimation or 3SLS
estimation is impossible, the full information maximum likelihood estimate
can be computed.




6.   The  Relationship  of  FIML  to  Recently  Proposed 
Instrumental  Variable  Estimators 

Three recent papers have proposed instrumental variable estimators
for linear simultaneous equation systems. Here these estimators are all
shown to be particular cases of the basic FIML iteration developed in
equation (19). The estimators of Lyttkens [5], Dhrymes [2], and Brundy and
Jorgenson [1] all have the form:

(i)  Construct a consistent estimate of the structural parameters
($\delta$, $\Sigma$). These initial consistent estimates may be obtained by
the use of consistent, but possibly inefficient, instrumental variable
estimators using the format of equation (3). This procedure is always
possible so long as $T \ge r_i + s_i - 1$ for all i = 1, ..., M, which is
condition (ii) of Theorem 1. In constructing the instruments for equation i,
$W_i$, to insure consistency it is necessary to include all $s_i$
predetermined variables from equation i as instruments. The remaining
$r_i - 1$ instruments can be constructed by regressing the $r_i - 1$ jointly
dependent variables in equation i on a subset of all the excluded
predetermined variables. By the orthogonality assumption, E(Z'U) = 0, the
estimates $\hat\delta$ will be consistent but, in general, not efficient
estimates. This procedure is followed for all M equations; and S, a
consistent estimate of $\Sigma$, is derived from the residuals of the
structural equations in the usual manner.


1.   Lyttkens' method does not compute S, but rather uses the identity
matrix. Thus his estimator is consistent but not generally efficient.
(ii)  Construct system instrumental variables $\bar W$ using the form of
equation (16), $\bar W' = \bar X' (S \otimes I_T)^{-1}$. Consistent estimates
of $\bar X$ are provided from the first step of the procedure since by
definition $\delta_i = [\beta_i' \;\; \gamma_i']'$ and from equation (17)
$\bar X_i = [Z (\Gamma B^{-1})_i \;\; Z_i]$. Note that all a priori
restrictions are being imposed to estimate the instrumental variables
$\bar W$ rather than unrestricted estimates as in k-class and 3SLS
instruments $\tilde W$ as shown in equation (22).

(iii)  Estimate the structural parameters as in equation (19),
$\hat\delta = (\bar W' X)^{-1} \bar W' y$. If desired, compute efficient
estimates of $\Sigma$ and the reduced form parameters.


Brundy and Jorgenson stop at this point and have efficient estimates since
their estimates converge in distribution to the FIML estimates by an
argument identical to that of equations (24) and (25). Lyttkens and
Dhrymes propose an iterative process between steps (ii) and (iii) while
unaware of the properties of the final estimates. But since this procedure
is in every way identical to equation (19), by the earlier derivation if
the iteration converges the estimates ($\delta^*$, $S^*$) are the FIML
estimates! Thus these iterated instrumental procedures will be numerically
identical to FIML if both use identical initial consistent estimates. Thus
Dhrymes' [2] question of the effect of the initial estimates used in step
(i) is answered for small samples; and for large samples even without
identical initial estimates, under the usual regularity conditions the
Cramer-Rao theorem can be invoked to insure a unique maximum likelihood
estimate almost surely.

Also, note that the so-called limited information procedure proposed
by Brundy and Jorgenson is misnamed. The procedure is identical to Lyttkens'
in using the identity matrix as an estimate of the contemporaneous
correlation matrix $\Sigma$. This procedure is not limited information since
it utilizes all the a priori restrictions on the $\delta$ in estimating the
instrumental variables of step (ii). Thus any error of misspecification
will be propagated throughout the entire system rather than being confined
to the equation in which it occurs as in true limited information methods.
Since the a priori restrictions are being imposed, FIML or its one iteration
special case might as well be used to provide fully efficient estimates
rather than only consistent estimates which the Brundy-Jorgenson 'limited
information' procedure gives.

Lastly, while multicollinearity often makes computation of the
unrestricted instrumental variables $\tilde W$ used in 3SLS, as in equation
(22), extremely difficult, FIML and the single iteration procedures use
fully restricted estimates $\bar W$ as in equation (17), so this problem
will no longer exist. Thus in the full information context, procedures using
principal components need not be used for the multicollinearity problem.
Also, in the finite sample case since all a priori restrictions are being
imposed these instrumental variable procedures might well be preferred to
3SLS which imposes the restrictions only in the final stage. In 3SLS the
estimates of the included right hand side endogenous variables often differ
little from the actual and presumably non-orthogonal variables due to
lack of degrees of freedom, but FIML and the instrumental variable
procedures by imposing all the a priori restrictions will often have many
more degrees of freedom in estimation.
7.   FIML  for  Nonlinear  Systems

Consider the general nonlinear simultaneous equation system

(32)  $F(Y, Z; \alpha) = U.$

Here F is an MT vector of functions $[f_1, f_2, \ldots, f_M]$ which in a
neighborhood in R dimensional space of the true parameter values,
$\alpha^*$, are assumed uniformly bounded and three times differentiable
with uniformly bounded derivatives. Also, the $f_i$ are assumed continuous
with respect to Y and Z. The system is assumed to be identified and $\alpha$
to belong to a compact subset of R dimensional space. As before the
structural errors are assumed i.i.d. and distributed as a nonsingular
M-variate normal distribution.
The log of the likelihood function is

(33)  $L(\alpha, \Sigma) = C + \tfrac{T}{2} \log\det(\Sigma)^{-1} + \sum_{t=1}^{T} \log |J_t| - \tfrac{T}{2} \mathrm{tr}[\tfrac{1}{T} \Sigma^{-1} F(Y, Z; \alpha)' F(Y, Z; \alpha)]$

where $|J_t|$ is the Jacobian of the transformation from U to Y. Note the
important complication introduced by the nonlinear structural system is
that in equation (33) the Jacobian is no longer constant as in equation
(6) but instead varies with each observation. Therefore, the first order
conditions cannot be simplified as in equations (11) and (12) to provide
a convenient iterative procedure. The log likelihood in principle can be
maximized by straightforward 'hill-climbing' algorithms but this procedure
may prove impractical unless special assumptions are made about the structural
system.
One very special case of the general model, which nevertheless is quite
common to econometric models, is that of non-linearity only in the parameters.
Here the structure is linear in the variables and nonlinear in the parameters
A, which are analytic non-linear functions of an R dimensional vector of
parameters $\alpha$ so that

(34)  $U = XA(\alpha) = YB(\alpha) + Z\Gamma(\alpha)$

where X = [Y Z]. Two important examples of such a structure are linear
simultaneous equation models with autoregressive errors and partial
adjustment or distributed lag models containing a 'desired' stock which is a
function of structural parameters. Writing out the log of the likelihood
function where $B(\alpha)$ are the parameters of the endogenous variables
gives

(35)  $L(\alpha, \Sigma) = C + \tfrac{T}{2} \log\det(\Sigma)^{-1} + T \log\det(B(\alpha)) - \tfrac{T}{2} \mathrm{tr}[\tfrac{1}{T} \Sigma^{-1} (XA(\alpha))' (XA(\alpha))]$
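The Jacobian is once again constant and the first order conditions are

(36)  $\partial L / \partial \alpha$ :  $T (B(\alpha)')^{-1} \frac{\partial B}{\partial \alpha} - (\frac{\partial A}{\partial \alpha})' X' (XA(\alpha)) \Sigma^{-1} = 0$

(37)  $\partial L / \partial \Sigma^{-1}$ :  $T\Sigma - (XA(\alpha))' (XA(\alpha)) = 0$

Noting that $\partial B / \partial \alpha$ is a submatrix of
$\partial A / \partial \alpha$ and using the same substitution technique of
equations (11) and (12) yields the non-linear iterative equation

(38)  $\hat\alpha_{k+1} = (\bar W_k' X \frac{\partial A}{\partial \alpha})^{-1} \bar W_k' y$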


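where the instruments are

(39)  $\bar W' = (\frac{\partial A}{\partial \alpha})' \bar X' (S \otimes I_T)^{-1}.$

The elements of $\bar W$ are

(40)  $\bar X = \mathrm{diag}(\bar X_1, \bar X_2, \ldots, \bar X_M)$,   $\bar X_i = [Z (\Gamma(\alpha) B(\alpha)^{-1})_i \;\; Z_i]$

and from equation (37):

(41)  $S = T^{-1} (XA(\alpha))' (XA(\alpha))$

The limit of the iterative process $\alpha^*$ is the FIML estimate with
asymptotic covariance matrix
$((\frac{\partial A}{\partial \alpha})' \bar X^{*\prime} (S^* \otimes I_T)^{-1} X \frac{\partial A}{\partial \alpha})^{-1}$.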

In Section 4 the instrumental variable interpretation of 3SLS was
given and its asymptotic equivalence to FIML shown. In a similar manner
for the non-linear in parameters system of equation (34) a non-linear
3SLS estimator (NL3SLS) may be defined

(42)  $\hat\alpha_{NL3SLS} = (\tilde W' X \frac{\partial A}{\partial \alpha})^{-1} \tilde W' y$

with the instruments

(43)  $\tilde W = Z (\tilde S \otimes Z'Z)^{-1} Z' X \frac{\partial A}{\partial \alpha}$

where $\tilde S$ is a consistent estimate of $\Sigma$. Note that the NL3SLS
estimator is nonlinear due to the presence of the
$\partial A / \partial \alpha$ matrix and will therefore require an iterative
procedure. However, it will require less computation than the FIML estimator
since the large block $Z (\tilde S \otimes Z'Z)^{-1} Z' X$ remains constant
while FIML revises $\bar X$ on each iteration. The asymptotic equivalence of
NL3SLS and FIML follows from application of equation (25) with the asymptotic
covariance matrix of NL3SLS being
$((\frac{\partial A}{\partial \alpha})' X' Z (\tilde S \otimes Z'Z)^{-1} Z' X \frac{\partial A}{\partial \alpha})^{-1}$.

For asymptotic efficiency there is no need to define a non-linear 2SLS
estimator for the computation of S. Any consistent estimate will do; and in
particular, the estimate derived by not imposing the across-parameter
constraints is easily shown to be consistent. For example, in the autoregressive
case, a parameter from the autoregressive specification will usually multiply
more than one of the other parameters. An unconstrained estimate which treats
each of the terms as different will yield consistent estimates of the
disturbances, from which a consistent estimate of $\Sigma$ follows in the usual way.
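
As an illustration of the autoregressive case (a hypothetical single-equation example, not taken from the text), suppose $y_t = \beta x_t + u_t$ with $u_t = \rho u_{t-1} + \varepsilon_t$ and $x_t$ exogenous. Quasi-differencing gives an equation in which $\rho$ multiplies $\beta$; the unconstrained version replaces the product by a free coefficient $\gamma$, which is an assumption of notation made here.

```latex
\begin{align*}
y_t &= \rho\, y_{t-1} + \beta\, x_t - \rho\beta\, x_{t-1} + \varepsilon_t
      && \text{(constrained, } \alpha = (\rho, \beta)\text{)} \\
y_t &= \rho\, y_{t-1} + \beta\, x_t + \gamma\, x_{t-1} + \varepsilon_t
      && \text{(unconstrained, } \gamma \text{ free)} \\
\frac{\partial (\rho,\ \beta,\ -\rho\beta)'}{\partial (\rho,\ \beta)}
    &= \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ -\beta & -\rho \end{pmatrix}
\end{align*}
```

The matrix in the last line plays the role of $\partial A/\partial\alpha$ for this example. Because it depends on $(\rho, \beta)$, it has to be re-evaluated at each iteration, whereas residuals from a consistent estimate of the unconstrained equation can be formed without imposing $\gamma = -\rho\beta$.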

Another estimator which is asymptotically efficient and uses more
restrictions in the estimation of the instruments than does 3SLS is the
non-linear analogue of the Lyttkens, Dhrymes, and Brundy and Jorgenson procedures.
It corresponds to one step of the FIML iteration:

(i)  Construct a consistent estimate of the structural parameters
$(A(\alpha), \Sigma)$. A consistent, but inefficient, instrumental variables procedure
on each of the M equations in which the nonlinear constraints are not
imposed is the simplest procedure. A consistent estimate S follows in
the usual manner.

(ii)  Use these consistent estimates to form instruments with equation
(40) defining X and the iterative procedure


(44)  $\alpha = \left(W' X \frac{\partial A}{\partial \alpha}\right)^{-1} W' y.$


Here the instruments W remain constant with respect to the X term while
the $\partial A/\partial\alpha$ matrix changes on each iteration until convergence is achieved.
Thus the iterative procedure differs from FIML where X is also changing
with each iteration.
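
The idea of holding the instruments fixed while re-evaluating the Jacobian can be sketched on a toy problem. The sketch below is not the recursion (44) itself: it uses a Gauss-Newton form, alpha <- alpha + (W'XJ)^{-1} W'(y - X delta(alpha)), which is an assumption of this illustration and which coincides with (44) when the structural coefficients are linear in alpha. The coefficient map `delta`, its Jacobian, the single-equation setup, the noise-free data, and the choice of W are all hypothetical.

```python
import numpy as np

def delta(alpha):
    """Structural coefficients as a function of alpha = (rho, beta) (illustrative)."""
    rho, beta = alpha
    return np.array([rho, beta, -rho * beta])

def jacobian(alpha):
    """d delta / d alpha, the analogue of dA/d alpha for this example."""
    rho, beta = alpha
    return np.array([[1.0, 0.0], [0.0, 1.0], [-beta, -rho]])

def fixed_instrument_iv(y, X, W, alpha0, tol=1e-12, max_iter=100):
    """Iterate alpha <- alpha + (W'X J)^{-1} W'(y - X delta(alpha)) with W held fixed."""
    alpha = np.asarray(alpha0, dtype=float)
    for _ in range(max_iter):
        J = jacobian(alpha)                       # re-evaluated every iteration
        step = np.linalg.solve(W.T @ X @ J, W.T @ (y - X @ delta(alpha)))
        alpha = alpha + step
        if np.max(np.abs(step)) < tol:
            break
    return alpha

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
alpha_true = np.array([0.5, 2.0])
y = X @ delta(alpha_true)                         # noise-free data, for checking only
alpha_init = np.array([0.4, 1.8])
W = X @ jacobian(alpha_init)                      # instruments built once, then fixed
alpha_hat = fixed_instrument_iv(y, X, W, alpha_init)
```

In a FIML-type iteration the line that builds `W` would move inside the loop, which is the extra work that the fixed-instrument procedure avoids.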


Thus three instrumental variable estimators have been proposed to
treat the non-linear in parameters case: FIML, NL3SLS, and non-linear
instrumental variables. Each provides asymptotically efficient estimates, with
FIML requiring the most computation since both X and $\partial A/\partial\alpha$ are changing
across iterations. The other two procedures keep X constant and iterate
only over $\partial A/\partial\alpha$. As before, on a consideration of degrees of freedom the
non-linear instrumental variables procedure might be preferred to NL3SLS.

8.   Conclusions


The conceptual framework of instrumental variables has been used to
demonstrate the close relation of FIML, 3SLS, and recently proposed
instrumental variable procedures. While Madansky had established an
instrumental variable interpretation of 3SLS, the instrumental variable
interpretation of FIML is new and leads to an extremely simple proof of
asymptotic equivalence by showing convergence in distribution of the two
estimators. The other instrumental variable procedures are shown to be one
step of the FIML iteration and therefore, if iterated, will yield the
FIML estimate.

Exact degrees of freedom requirements for estimability of FIML are
calculated and are weaker than those for 3SLS, not stronger as Professor
Klein implied. Thus FIML is a possible estimator when 3SLS, 2SLS, LIML,
and  other  k-class  estimators  cannot  be  used.   This  result  follows  since 
FIML  imposes  all  a  priori  restrictions  in  forming  the  instruments  while 
the  other  estimators  use  unrestricted  estimates  as  instruments.   Thus  the 
multicollinearity  problem  present  in  computing  3SLS  will  be  lessened  by 
using  the  a  priori  restrictions. 

Lastly, FIML and two new estimators, NL3SLS and non-linear
instrumental variables, are developed for the important case of a structural
system which is nonlinear in the parameters. All three procedures require
an iterative method and are asymptotically efficient. Again, FIML
requires the most computation while NL3SLS does not impose a priori
restrictions in forming the instruments.